DOCUMENT RESUME

ED 291 774                                              TM 011 059

AUTHOR        Ravelo Hurtado, Nestor E.; Nitko, Anthony J.
TITLE         Selection Bias According to a New Model and Four
              Previous Models Using Admission Data from a Latin
              American University.
PUB DATE      [86]
NOTE          27p.
PUB TYPE      Reports - Evaluative/Feasibility (142)
EDRS PRICE    MF01/PC02 Plus Postage.
DESCRIPTORS   *Bias; *College Admission; College Entrance
              Examinations; Foreign Countries; Higher Education;
              *Personnel Selection; Screening Tests; *Selective
              Admission
IDENTIFIERS   Lottery; University of Oriente (Venezuela);
              *Venezuela

ABSTRACT
This paper describes a modified lottery selection
procedure and compares it with several popular unbiased candidate
selection models in a Venezuelan academic selection situation. The
procedure uses a modified version of F. S. Ellett's lottery method as
a means of partially satisfying the principles of substantive
fairness. Ellett's procedure establishes an upper cut-score and
recommends acceptance of everyone whose test score is at or above the
cut-score. A second, lower cut-score is also established, so that
everyone scoring at or below this score is rejected. After hiring or
admitting candidates in the upper group, additional openings are
filled by lottery from among those between the upper and lower
cut-scores. The modification of Ellett's procedure, referred to as
the probability level assignment model (PLAM), involves division of
the score scale between the upper and lower cut-scores into several
equal-width intervals, within which applicants are selected via
lottery in proportion to the probability for success upon admission
or employment. Results from application of this method to 272
first-year students at the Universidad de Oriente in Cumana,
Venezuela, indicate that the PLAM appropriately addresses the various
criteria of selection fairness. Eight data tables are appended. (TJH)



************************************************************** 

* Reproductions supplied by EDRS are the best that can be made * 

* from the original document. * 
********************************************************* 






SELECTION BIAS ACCORDING TO A NEW MODEL 
AND FOUR PREVIOUS MODELS USING ADMISSION 
DATA FROM A LATIN AMERICAN UNIVERSITY 



NESTOR E. RAVELO HURTADO 
Universidad de Oriente



ANTHONY J. NITKO 
University of Pittsburgh 



"PERMISSION TO REPRODUCE THIS
MATERIAL HAS BEEN GRANTED BY



TO THE EDUCATIONAL RESOURCES
INFORMATION CENTER (ERIC)."

U.S. DEPARTMENT OF EDUCATION
Office of Educational Research and Improvement
EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)

This document has been reproduced as
received from the person or organization
originating it.

Minor changes have been made to improve
reproduction quality.

Points of view or opinions stated in this document
do not necessarily represent official
OERI position or policy.



Running head: Selection Bias




The purpose of this paper is to describe a modified lottery
selection procedure and to compare it with several popular unbiased
selection models in a Venezuelan academic selection situation.



Background 



In an interesting analysis several years ago, Ellett (1977)
examined the ethical concept of fairness in relation to selection
bias. Ellett compared each of several "unbiased" selection models'
necessary conditions for nonbias against the following criteria of
substantive fairness: (1) everyone with the same qualifications
should be treated in the same manner during selection; (2) if
there are more equally qualified applicants than there are
openings, a lottery from among that group should be used to fill the
openings; and (3) every suitably qualified applicant should have
the same probability of being identified as qualified (and,
similarly, every unqualified applicant should have the same
probability of being so identified).

Ellett's review concluded that none of the selection test
bias models reviewed (regression, equal risk, constant ratio, and
conditional probability models) satisfied these substantive
fairness criteria because the criteria require that a predictor
test have perfect validity. This pessimistic conclusion was
reached for the unbiased selection models reviewed because each
used a single cut-score, whether the cut-score be within a
subgrouping of applicants or in the pooled applicant group. When a
single cut-score is used, all applicants at or above the cut-score
are selected, even though some persons so selected have little
chance of success, and all applicants below the cut-score
are rejected, even though some persons so rejected have
considerable chance of being successful.

As a practical solution, Ellett proposed a modified lottery
as a way to satisfy partially the principles of substantive
fairness. Under this proposal an upper cut-score is established which
is sufficiently high that the probability of being at least
minimally successful is very high for all those above it. Error of
measurement in the predictor test may be taken into account in
setting this cut-score by lowering it on the basis of the test's
standard error of measurement. Once set, everyone whose test score
is at or above this cut-score is selected. In a similar way, a
second, lower cut-score is established so that everyone scoring at
or below this score is rejected.

Job or admission openings are filled by selecting those in
the upper group first. If there are additional openings, these are
filled by lottery from among those applicants whose test scores are
between the upper and lower cut-scores.
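
The two-cut-score lottery just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the example scores, and the cut-scores are hypothetical, and the handling of the case in which the upper group alone exceeds the openings (a lottery within that group) is an assumption, since the text does not address it.

```python
import random

def ellett_select(scores, upper_cut, lower_cut, openings, seed=None):
    """Return sorted indices of applicants selected by a two-cut-score lottery.

    Hypothetical helper: everyone at or above upper_cut is taken first,
    everyone at or below lower_cut is rejected, and remaining openings are
    filled by simple random draw from the middle group.
    """
    rng = random.Random(seed)
    upper = [i for i, s in enumerate(scores) if s >= upper_cut]
    middle = [i for i, s in enumerate(scores) if lower_cut < s < upper_cut]

    # Assumption: if the upper group alone exceeds the openings,
    # draw by lottery within the upper group.
    if len(upper) >= openings:
        return sorted(rng.sample(upper, openings))

    remaining = openings - len(upper)
    drawn = rng.sample(middle, min(remaining, len(middle)))
    return sorted(upper + drawn)

# Illustrative (made-up) predictor scores and cut-scores.
example_scores = [70, 65, 55, 50, 45, 30, 20]
example_selected = ellett_select(example_scores, upper_cut=60, lower_cut=35,
                                 openings=4, seed=1)
```

In this sketch every middle-group applicant has the same chance of being drawn regardless of score, which is the feature of Ellett's proposal that the PLAM modifies.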

The basis of Ellett's proposal is the concept that applicants
within each of the three groups are essentially equivalent or
"exchangeable". Ellett reaches this deduction because the
predictor test lacks perfect validity and so, except for measurement
error, persons within each of the three groups are identical.

This deduction runs counter to our experience with all but
the poorest tests. Even a test with less than perfect validity is
able to distinguish validly at least the higher from the lower
ability applicants. To the extent that higher scoring applicants
in the "middle group" have a higher probability than those
applicants with lower scores of being at least minimally successful,
to that extent will Ellett's procedure treat unfairly higher- and
lower-scoring applicants from the "middle group".

Probability Level Assignment Model

A modification of Ellett's proposal may lead to an increase
in substantive fairness when a selection test has reasonable, but
imperfect, predictive validity. This modification is called the
probability level assignment model (PLAM) in this paper. The
procedure is outlined in Table 1.



The PLAM establishes upper and lower cut-scores in a manner
similar to Ellett's proposal. However, the score scale between
those two cut-scores is divided into several equal-width intervals.
Within these intervals applicants are selected by lottery in
proportion to the conditional probability of those in the interval
being at least minimally successful on the criterion, hence the
term probability level assignment. To assure fairness within
subpopulations of applicants (e.g., racial groups), the total
number of applicants selected from that subgroup may be proportional
to the probability of being minimally successful in that
subpopulation.
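
The interval lottery for the middle band might be sketched as follows. The function name, the example scores, and the per-interval success probabilities are hypothetical; the largest-remainder rounding used to turn proportional shares into whole numbers of places is an assumption, since the paper does not say how fractional allocations are resolved.

```python
import random

def plam_middle_lottery(scores, lower_cut, upper_cut, n_intervals,
                        success_prob, vacancies, seed=None):
    """Allocate vacancies across equal-width score intervals in proportion
    to each interval's conditional probability of success, then draw at
    random within each interval. Returns (selected indices, quotas)."""
    rng = random.Random(seed)
    width = (upper_cut - lower_cut) / n_intervals

    # Sort middle-band applicants into equal-width intervals.
    bins = [[] for _ in range(n_intervals)]
    for idx, s in enumerate(scores):
        if lower_cut < s < upper_cut:
            k = min(int((s - lower_cut) / width), n_intervals - 1)
            bins[k].append(idx)

    # Proportional shares, rounded by largest remainder (assumption).
    total_p = sum(success_prob)
    raw = [vacancies * p / total_p for p in success_prob]
    quota = [int(r) for r in raw]
    leftover = vacancies - sum(quota)
    by_remainder = sorted(range(n_intervals),
                          key=lambda k: raw[k] - quota[k], reverse=True)
    for k in by_remainder[:leftover]:
        quota[k] += 1

    # Lottery within each interval, capped by the applicants available.
    selected = []
    for k in range(n_intervals):
        take = min(quota[k], len(bins[k]))
        selected.extend(rng.sample(bins[k], take))
    return sorted(selected), quota

# Made-up middle-band scores: two applicants per interval of width 5.
demo_scores = [36, 38, 41, 43, 46, 48, 52, 54, 57, 59]
demo_selected, demo_quota = plam_middle_lottery(
    demo_scores, lower_cut=35, upper_cut=60, n_intervals=5,
    success_prob=[0.2, 0.4, 0.6, 0.8, 1.0], vacancies=6, seed=3)
```

Unlike a single random draw over the whole middle group, this stratified draw fixes the number selected at each score level, so higher-scoring intervals receive proportionally more places in every application of the model.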

This probability level assignment model is similar to
Bereiter's (1975) probability weighted model (PWM). In Bereiter's
approach each applicant's name is entered in a lottery in
proportion to the conditional probability of being at least minimally
successful. Names are then drawn by simple random selection. The
PWM appears to use a pooled or common regression line and does not
adjust the lottery on the basis of subpopulation membership. By
not using stratified random sampling, the PWM does not assure that,
in any one application of the model, there will be a proportional
number of applicants selected from particular test score levels.

The PLAM is related also to the equal probability model
(EPM) proposed by Peterson and Novick (1976) since both are based
on the conditional probability of being minimally successful. The
EPM equalizes the conditional probability across groups whereas the
PLAM capitalizes on the different conditional probabilities in each
group to establish a limit on the number of applicants to be selected
from each group. Further, the EPM uses a single cut-score within each
group. From Ellett's viewpoint, the EPM would be less substantively
fair than is the PLAM for individual applicants. Thus, the PLAM
attempts to be fair both to groups and to individuals within groups.

The PLAM, as other lottery and/or limiting admission models of
selection, is criticized easily from the perspective of the employer
(e.g., see Jensen, 1980) since it does not set out to maximize a
criterion payoff for the employer: More false positive errors will
be made using the PLAM than when using the regression model, for example.
On the other hand, for any given degree of predictive validity, fewer
false negative errors will be made using PLAM than using any of the
other fair selection models. Further, fewer false positive errors will
be made using PLAM than using other lottery models.

Depending on how valid the selection test is within a subpopulation
of applicants, the consequences of false positive errors may be minimal.
Further, payoff on the job or grade point average criterion may not be
the only consideration for an employer or an institution in making
selection decisions. Subgroup representation, for example, may be a
desirable goal. One may be willing to sacrifice some criterion payoff
to attain such representation provided the payoff loss is not severe.
Similarly, reducing false positives may have less utility for some
institutions than minimizing false negatives: clearly the PLAM would
be preferred when the latter is the case.

The PLAM has been evaluated elsewhere (Ravelo-Hurtado, 1986) in
terms of mathematical consistency, utility, ethical viewpoint, and
(using simulated conditions) effect of unreliability. Table 2 summarizes
this evaluation.






Empirical Comparison of Several
Selection Models

The practical consequences of using the PLAM were explored
empirically by comparing it with four other well-known fair
selection models: the regression model (RM) (see Cleary, 1968), the
equal risk model (ERM) (see Einhorn and Bass, 1971), the constant
ratio model (CRM) (see Thorndike, 1971), and the conditional
probability model (CPM) (see Cole, 1973).

Context of the Comparisons

Sample. The sample consisted of 272 freshman students
who had completed the first semester at the Universidad de Oriente
(UDO) in Cumana, Venezuela. These students were admitted under the
government's open-admission policy which prevailed until 1982. The
students had taken the Academic Aptitude Test (AAT) on a voluntary
basis along with approximately 20,000 candidates nationwide who
were applying for admission to various universities throughout
Venezuela for the 1980-81 academic year. The AAT was not used for
admission decisions, but the data were used to establish test norms
and validity coefficients. The students in the sample came from
throughout Venezuela. Nevertheless, it is a self-selected sample.






The Predictor. The predictor used in this study was the
Index of Academic Attainment (IAA). The Central Office of Guidance
and Admission to Higher Education was set up by the National
Council of Universities (CNU) to implement higher education admission
policies. To this end, both AAT scores and high school grade point
average (HSGPA) are used as a composite predictor called the IAA:

$\mathrm{IAA} = .50(\mathrm{HSGPA}) + .25(\mathrm{VR}) + .25(\mathrm{NA})$

Here, HSGPA = High school grade point average standardized
              on a 50 ± 10 scale,
      VR    = Verbal Reasoning subtest score of the AAT
              expressed on a 50 ± 10 scale, and
      NA    = Numerical Ability subtest score of the AAT
              expressed on a 50 ± 10 scale.

Verbal Reasoning is a 65 item, 75 minute, objective test measuring
vocabulary knowledge, phrase content, and comprehension of ideas in
context. Numerical Ability is a 65 item, 75 minute, objective test
measuring knowledge of arithmetic, algebra, and geometry. Two
forms of the AAT are represented in the data for this study. The norm
group reliabilities for these forms are (a) VR .87 (n = 11,252) and
.86 (n = 10,187), and (b) NA .85 (n = 11,252) and .79 (n = 10,187).
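
As a minimal sketch of how such a composite could be computed, the Python below first puts a raw score on a scale with mean 50 and standard deviation 10 and then applies the .50/.25/.25 weights. The function names and the norm-group means and standard deviations in the example are hypothetical; the paper does not report the raw-score norms.

```python
def to_50_10_scale(raw, norm_mean, norm_sd):
    """Convert a raw score to a scale with mean 50 and SD 10."""
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd

def iaa(hsgpa_scaled, vr_scaled, na_scaled):
    """Index of Academic Attainment from three scores already on the
    50 +/- 10 scale: IAA = .50(HSGPA) + .25(VR) + .25(NA)."""
    return 0.50 * hsgpa_scaled + 0.25 * vr_scaled + 0.25 * na_scaled

# Hypothetical applicant: norm means and SDs below are made up.
hsgpa = to_50_10_scale(raw=15.0, norm_mean=12.0, norm_sd=3.0)   # 60.0
vr = to_50_10_scale(raw=30.0, norm_mean=35.0, norm_sd=5.0)      # 40.0
na = to_50_10_scale(raw=28.0, norm_mean=33.0, norm_sd=5.0)      # 40.0
example_iaa = iaa(hsgpa, vr, na)                                # 50.0
```

Because HSGPA carries half the weight, one standard deviation above the mean on HSGPA offsets one standard deviation below the mean on both AAT subtests, as the example shows.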




The Criterion to be Predicted 






The criterion for this study was the student's college
grade point average (GPA) after one semester at the University
(UDO). Course grades at UDO are on a 10-point scale.

For purposes of this study we set the minimally acceptable
grade point average as 4.5. Students with GPAs less than 4.5 are
considered unsuccessful.

Selection Ratio 

For purposes of this study, the selection ratio was set at
71% or 194 of the 272 students studied. Such a large selection
ratio represents colleges and universities that are relatively
nonselective. Recall, too, that before the introduction of the IAA
admission to Venezuelan colleges and universities was open, so a
71% selection ratio is not unreasonable in this context. Further,
setting the selection ratio at 71% permits a more complete study of
the effects of each selection model in relation to the PLAM over a
wide range of ability that applicants bring to the selection
situation.

Groups of Students Studied 

Four groups of students were studied as follows:

1. Socioeconomic status (SES). The Central Office for Guidance
and Admission to Higher Education rates each student on five SES
variables, then makes a composite index from these ratings. The
variables are father's occupation, mother's educational level,
amount of family income, source of family income, and type of home.
Each variable is rated on a scale of 1 to 5, where 1 is the highest
SES rating; then the five ratings are summed to obtain an SES
index ranging from 5 through 25.

For purposes of this study students were grouped into High
SES (index scores less than or equal to 17) and Low SES (index
scores above 17). Those in the Low SES group constituted the
minority group.
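
The index and the High/Low split can be written out as a short sketch. The function names and the example ratings are hypothetical; only the 1-to-5 rating scale, the five variables, and the cut at 17 come from the text.

```python
SES_VARIABLES = ("father_occupation", "mother_education",
                 "income_amount", "income_source", "home_type")

def ses_index(ratings):
    """Sum five ratings (1 = highest SES, 5 = lowest); range 5 to 25."""
    if len(ratings) != len(SES_VARIABLES):
        raise ValueError("expected one rating per SES variable")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("each rating must be between 1 and 5")
    return sum(ratings)

def ses_group(ratings):
    """High SES if the index is 17 or less, otherwise Low SES."""
    return "High SES" if ses_index(ratings) <= 17 else "Low SES"

# Made-up ratings: an index of 17 falls in the High SES group,
# an index of 18 in the Low SES group.
borderline_high = ses_group([4, 4, 3, 3, 3])
borderline_low = ses_group([4, 4, 4, 3, 3])
```

Note that lower index values mean higher status, so the "High SES" label attaches to the smaller sums.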

2. Gender. University records were used to identify a
student's gender. Females constituted the minority group.

3. Age. Typically, high school students graduate at age 18.
Those students who were 18 years old or younger at the time of
their admission to UDO constituted the younger group; those over
age 18 constituted the older group. The older group constituted
the minority group.

4. Type of High School Attended. Students attended either a
public or private high school. Those attending private high
schools constituted the minority.

Table 3 shows the number of students in each of the groups.
The table also shows the predictive validity of the IAA within each
subgroup. Only the difference between the correlations in the
older and younger groups is statistically significant at the .03
level.
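
The paper does not name the significance test used. As a hedged check, the sketch below applies Fisher's r-to-z transformation for two independent correlations, which is an assumption on our part, to the younger-group and older-group values given in Table 3 (r = .52, n = 185 and r = .28, n = 87). The function name is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist

def fisher_z_difference(r1, n1, r2, n2):
    """z statistic and two-sided p-value for the difference between two
    independent correlations, using Fisher's r-to-z transformation."""
    z = (atanh(r1) - atanh(r2)) / sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_two_sided

# Younger vs. older group correlations and sample sizes from Table 3.
z_age, p_age = fisher_z_difference(0.52, 185, 0.28, 87)
```

Under this assumption the statistic comes out close to the z = 2.19 reported in the footnote to Table 3, with a two-sided p-value near .03, which would reconcile the ".03 level" in the text with the ".05 level" in the table footnote.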

Criteria for Comparing the Models 

There is no single criterion which would satisfy all
stakeholders in a selection situation; thus, multiple criteria must
be used. There are at least three types of stakeholders who have a
stake in any selection procedure: the individual applicant, the
leadership of a particular subgroup of applicants, and the employer
(or institutional representative). The following criteria were
used to compare the various fair selection models.

1. The number and proportion of applicants selected from each
subgroup by each procedure. This criterion is of primary concern
to subgroup leaders who seek to have more of their subgroup members
selected and, thereby, give more of them access to social benefits.
The number of subgroup members selected by any selection procedure
which incorporates a regression equation in some form depends on
the predictive validity of the test for the subgroup and on the
distribution of the subgroup's scores on the predictor and the
criterion.

2. The probability of a potentially successful applicant with
a given test score being selected. This criterion is of primary
concern to individual applicants who would be at least minimally
successful on the criterion if they would be selected. It is
expected that the PLAM is most appealing on this criterion,
especially to those scoring in the middle range on the predictor test,
since they may have no chance of being selected when only a
single, high cut-score is used.

3. The success ratio. This criterion is of primary concern to
institutional representatives (or employers) who seek to minimize
the proportion of failures among those selected (i.e., those who
will perform below the minimally acceptable level on the
criterion). It is expected that the regression model will result in the
highest overall success ratio, but the size of its advantage over
other models would depend on the predictive validity, the placement
of the cut-score, and the marginal distributions in any local
application.

4. Means and standard deviations. This criterion is of primary
concern to institutional representatives (or employers) who are
concerned with maximizing the total yield on a continuous criterion
as a result of the selection process. It is expected that the
regression model will maximize the total yield (i.e., result in the
highest mean criterion score) among those selected, but the size of
its advantage over other models would depend on the predictive
validity and the marginal distributions in any local application.






Number and Proportion of Minority Group Students
Selected with Each Model

Table 4 shows the number and proportion of applicants
selected from each group when each of the selection models is applied
in the manner suggested by its author(s). As may be seen, the
conditional probability model (CPM) selects the largest number of
applicants from every minority group except the private high school
group. If each selection model is ranked within each grouping
(gender, age, etc.) according to number of minority group members
selected, and then these ranks are averaged, CPM has the highest
average rank, PLAM is next, followed by CRM, ERM, and RM in that
order.
Probability of a Potentially Successful Applicant with a Given Test
Score Being Selected

Table 5 shows the probability of selecting a potentially
successful applicant for each predictor test score level. The
probabilities are displayed for each model and for each way of
grouping the applicants. The values displayed in the table for the
PLAM were obtained from empirical results by multiplying the
proportion of applicants selected from a score interval by the
proportion of applicants in that interval who were successful (i.e.,
who had UDO GPA ≥ 4.5). The other models set one cut-score within a
subgroup. Thus, the probability of selecting a potentially
successful applicant equals one if that applicant's test score is
greater than or equal to the cut-score and equals zero otherwise.



Success Ratio 



The success ratio is the proportion of all those selected
who are at least minimally successful. These proportions were
obtained by applying each of the selection models as recommended by
its author(s), then determining the proportion of selected students
within each subgroup who were at least minimally successful. These
results are displayed in Table 6. In addition, the table shows the
proportion of all the students selected (n = 194) who were
successful. Since each model selects somewhat different persons in each
group, the success ratio is not equal over groups within a given
model.
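
As a small illustration of the definition, the sketch below computes a success ratio overall and within subgroups. The function names and the GPA values are made up for the example; only the 4.5 threshold comes from the study.

```python
MIN_GPA = 4.5  # minimally acceptable first-semester GPA used in this study

def success_ratio(selected_gpas, min_gpa=MIN_GPA):
    """Proportion of selected students whose GPA is at least min_gpa."""
    if not selected_gpas:
        raise ValueError("no students were selected")
    successes = sum(1 for g in selected_gpas if g >= min_gpa)
    return successes / len(selected_gpas)

def success_ratio_by_group(selected):
    """selected: list of (group_label, gpa) pairs for selected students."""
    by_group = {}
    for label, gpa in selected:
        by_group.setdefault(label, []).append(gpa)
    return {label: success_ratio(gpas) for label, gpas in by_group.items()}

# Hypothetical selected students (illustrative values only).
demo = [("Male", 5.0), ("Male", 4.4), ("Female", 4.5), ("Female", 6.2)]
overall = success_ratio([gpa for _, gpa in demo])   # 3 of 4 succeed
per_group = success_ratio_by_group(demo)
```

Because the students selected differ from model to model, both the overall ratio and the within-group ratios have to be recomputed for each model's selected set.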

As may be seen from the table, the PLAM tends to select the
lowest proportion of successful applicants overall, while the
regression model tends to select the highest proportion of successful
applicants. The overall success ratio difference between the RM
and the PLAM is about .035 in favor of the RM.

With regard to the success ratio of minority candidates,
however, the PLAM is somewhat better. It tended to select a
slightly higher proportion of successful females and older students
than the other models. It selected the smallest proportion of
successful high SES students and was tied for third in the success
ratio for students from private schools. Differences between the
model with the highest success ratio among minority groups and the
model with the lowest success ratio tend to be about .071.



Means and Standard Deviations 



Table 7 shows the means and standard deviations on the
criterion variable (GPA) of those persons selected by each of the
models. As may be seen, the regression model selects those with the
highest average criterion score, while the PLAM selects those with
the lowest average criterion score. The differences among the
models are small, however, with the maximum difference being around
.10 or 7% of the standard deviation of the criterion scores of
those selected by the regression model.



Conclusions 



1. From a minority viewpoint the CPM is the most appealing
model in this study because, in three of the four cases, a larger
number of minority applicants were selected than when one of the
other models was used. The PLAM was the next most appealing model
for minority groups because when it was used, a larger number of
minority applicants were selected than when one of the other models
was used. Nevertheless, the more appealing model to a minority
group (using the number of applicants selected as a criterion) is
the less appealing model to the corresponding majority group. Thus,
the RM and ERM were the models that were most appealing to the
majority groups.

2. The PLAM was the most attractive model for potentially
successful applicants, regardless of group membership, because it gave
a chance of selection to a larger range of potentially successful
applicants than the other models. The other models were very
similar to each other and quite different from the PLAM on this point.

3. From the viewpoint of utility to the institution, the RM,
ERM, and CRM were the preferred models because they led to a larger
success ratio than either the CPM or the PLAM. In addition, for
those selected by using the RM, ERM, or CRM, the mean criterion
score was consistently higher and the standard deviation lower than
for the CPM and the PLAM. However, the PLAM made the errors of
selection more alike between groups, and this could encourage an
institution to sacrifice a small benefit in order to have a more
substantively fair selection for all the groups.

Summary 

As is to be expected in selection situations, as well as in
other social control situations, different stakeholders seek to
optimize different criteria. Leaders of groups seek admission to
social benefits for more members of their groups. Individuals who
perceive themselves as being qualified would like at least an equal
chance of being selected as other persons they perceive to be
equally qualified. A selecting institution may seek to maximize
the success ratio and/or total criterion gain. Different models
address these different criteria and so are optimally appealing to
the self-interest of various stakeholders. Perhaps a compromise
between the stakeholders can be arranged. If so, the PLAM would
seem to be the model that best implements this compromise. The
differences between the results obtained in applying it and the
optimal results, at least in this application, appear to be small
and within tolerable limits of what various parties can expect to
give up in order to reach a negotiated settlement.






References

Bereiter, C. (1975). Individualization and inequality (Review of
A. R. Jensen, Educational Differences, 1974). Contemporary
Psychology, 20(6), 455-457.

Cleary, T. A. (1968). Test bias: Prediction of grades of Negro
and White students in integrated colleges. Journal of
Educational Measurement, 5, 115-124.

Cole, N. S. (1973). Bias in selection. Journal of Educational
Measurement, 10(4), 237-255.

Einhorn, H. J., and Bass, A. R. (1971). Methodological
considerations relevant to discrimination in employment testing.
Psychological Bulletin, 75(4), 261-269.

Ellett, F. S. (1977). Fairness of college admissions procedures:
A criticism of certain views. (Doctoral dissertation, Cornell
University, 1977). Dissertation Abstracts International, 1978,
38, 7204A (University Microfilms, Order No. 7807747).

Jensen, A. R. (1980). Bias in mental testing. New York:
MacMillan Publishing Co.

Peterson, N. S., and Novick, M. R. (1976). An evaluation of some
models for culture-fair selection. Journal of Educational
Measurement, 13(1), 3-29.

Ravelo-Hurtado, N. E. (1986). Development of a new selection bias
model and comparison with four existing models using admission
data from the Consejo Nacional de Universidades in Venezuela.
(Doctoral dissertation, University of Pittsburgh, 1986).

Thorndike, R. L. (1971). Concepts of culture-fairness. Journal
of Educational Measurement, 8(2), 63-70.




Table 1. Summary of Steps Used to Implement the Probability Level Assignment
Model (PLAM)

(1) Establish the minimum success level (score) on the criterion, $Y_M$.

(2) For each subgroup determine the lower cut-score, $X_L$, using the
    within-subgroup Y-on-X regression. All candidates with $X \le X_L$
    are rejected.

    (a) The lower cut-score corresponds to the maximum risk of a
        false negative, $\alpha_L$, where $\alpha_L$ is specified with
        regard to the conditional distribution of $Y \mid X_L$ within a
        subgroup. That is, for each subgroup,

            $P[Y > Y_M \mid X = X_L] = \alpha_L$

    (b) Frequently, it is appropriate to set $\alpha_L = .05$.

    (c) The procedure for finding the $X_L$-value is similar to that
        outlined by Einhorn and Bass (1971).

(3) Determine for each subgroup, j, the conditional probability of being
    at least minimally successful given a predictor test score above
    $X_L$. That is,

        $P_j = P[Y > Y_M \mid X > X_L]_j$

    Bivariate normal curve tables may be used for this purpose.

(4) For each subgroup determine the number of vacancies to be filled from
    that subgroup.

    (a) This number is proportional to the probability of being at least
        minimally successful given that one has been accepted:

            $\text{proportion selected in subgroup } j =
              \dfrac{P[Y > Y_M \mid X > X_L]_j}
                    {\sum_{j=1}^{J} P[Y > Y_M \mid X > X_L]_j}$

    (b) Bivariate normal curve tables may be used for this purpose.

    (c) Usually, J = 2 (e.g., males and females).

(5) For each subgroup determine the upper cut-score, $X_U$, using the
    within-subgroup Y-on-X regression in a manner similar to Step 2 above.

    (a) The upper cut-score corresponds to the maximum risk of a false
        positive, $\alpha_U$, specified in the conditional distribution
        of $Y \mid X_U$ within a subgroup.

    (b) $\alpha_U$ may be set to .05.

(6) Using a proportional selection as determined in Step 3, fill as many
    of the vacancies as possible (perhaps, all vacancies) with candidates
    who have $X \ge X_U$.

(7) If more vacancies remain to be filled after selecting those with
    $X \ge X_U$, use the modified lottery described below to select
    candidates in proportion to the within-group conditional probability
    of success, beginning with the upper score levels of each subgroup.

    (a) Divide the test score scale between $X_L$ and $X_U$ into
        intervals (e.g., each interval may be $.5\sigma_X$ in width).

    (b) For each interval, calculate the conditional probability of
        success given one's score is in the interval. That is,

            $P[Y > Y_M \mid X_i < X < X_{i+1}]$

        where $X_i$ and $X_{i+1}$ are the lower and upper boundaries of
        the interval.

    (c) Allocate the number of candidates to be selected from each
        interval in proportion to this within-group conditional
        probability.

    (d) Randomly select this number of candidates from each interval and
        for each of the J subgroups.
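
Steps 2, 4, and 5 can be illustrated with a short Python sketch. It assumes a linear Y-on-X regression with normally distributed, homoscedastic residuals, so that a cut-score is the predictor value at which the conditional probability of success equals a chosen level; this is one reading of the table, not necessarily the authors' exact computation. The function names are hypothetical, and the total-sample figures from Table 3 are used purely for illustration, whereas the PLAM itself works within subgroups.

```python
from math import sqrt
from statistics import NormalDist

def residual_sd(slope, sd_x, r):
    """Standard error of estimate implied by slope, SD of X, and r."""
    sd_y = slope * sd_x / r
    return sd_y * sqrt(1.0 - r ** 2)

def cut_score(y_min, slope, intercept, se, p_success):
    """Predictor score x at which P[Y > y_min | X = x] = p_success,
    assuming Y | X is normal with mean intercept + slope*x and SD se."""
    z = NormalDist().inv_cdf(p_success)
    return (y_min + z * se - intercept) / slope

def subgroup_shares(success_probs):
    """Step 4: share of vacancies per subgroup, proportional to P_j."""
    total = sum(success_probs)
    return [p / total for p in success_probs]

# Total-sample values from Table 3 (illustration only).
SLOPE, INTERCEPT, SD_X, R = 0.09612, -0.03493, 7.55, 0.51
Y_MIN = 4.5                  # minimally acceptable GPA
ALPHA_L = ALPHA_U = 0.05     # maximum false-negative / false-positive risk

se = residual_sd(SLOPE, SD_X, R)
x_lower = cut_score(Y_MIN, SLOPE, INTERCEPT, se, p_success=ALPHA_L)
x_upper = cut_score(Y_MIN, SLOPE, INTERCEPT, se, p_success=1.0 - ALPHA_U)
```

Under these assumptions the lower and upper cut-scores fall near IAA values of 26 and 68, symmetric about the score whose predicted GPA is exactly 4.5. The middle band between them would then be divided into intervals for the lottery in Step 7.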






Table 2. Summary Evaluation of the Probability Level Assignment Model

Aspect              Evaluation

Mathematical        Logically consistent within subgroups for all the
Consistency         groups, since the converse model and the PLAM lead
                    to the same conclusion.

Utility             1. Has a utility for the group that is proportional
                    to the conditional probability of being successful.

                    2. Has lower utility for the employer when the
                    number of vacancies is large and the highly qualified
                    are few.

                    3. Has equal utility for applicants within a subgroup
                    who have equal probability of success.

Effect of           Sometimes affects, but only slightly, the test cutoff
Unreliability       scores; however, the number of applicants to be
of the Predictor    selected from each subgroup remains the same.

Effect of           1. Favors the minority, the majority, or neither
Unreliability       one according to the case studied.
of the
Criterion           2. The differences between groups in percentage and
                    number of applicants selected run from 0 to 6.

Ethical             1. It is a quota model because vacancies are assigned
Viewpoint           in proportion to a predetermined criterion and because
                    it takes into account any probability of success
                    found in the group.

                    2. The necessary condition for achieving an unbiased
                    selection is linked to substantive fairness principles.

                    3. Qualified and unqualified applicants are treated
                    differently according to their criterion scores if their
                    predictor scores are extremely high or extremely low.

                    4. Qualified and unqualified applicants are treated
                    equally according to their criterion performance
                    scores if the predictor scores are between the low
                    and high cutoffs.

Source: Ravelo-Hurtado (1986)




Table 3. Correlation and Regression Data for the UDO Sample

                                    IAA           Predictive
Group                        n      Mean (SD)     Validity    Slope     Intercept

Total sample                272     50.03 (7.55)    .51       .09612     -.03493

Male                        149     50.15 (7.41)    .58       .10?74     -.48333
Female                      123     49.88 (7.75)    .43       .08386      .46703

Younger (≤ 18 yrs)          185     52.18 (7.04)    .52*      .10181     -.27810
Older (≥ 19 yrs)             87     45.44 (6.52)    .28*      .06021     1.45811

Low SES (index ≥ 18)         86     48.19 (7.00)    .58       .11208     -.94576
High SES (index ≤ 17)       186     50.88 (7.40)    .45       .08523      .57976

Public High School          218     49.19 (7.24)    .47       .09105      .13932
Private High School          54     53.39 (7.92)    .54       .09374      .3785?

*This difference is significant at the .05 level (z = 2.19). All other
differences are nonsignificant.






Table 4. Number and Percent of Applicants Selected From Each Group
According to the RM, ERM, CRM, CPM, and PLAM

Group         RM          ERM         CRM         CPM         PLAM

Male        109 (73)    111 (75)    114 (77)    105 (70)    112 (75)
Female       85 (69)     83 (67)     80 (65)     89          82 (67)
Younger     163 (88)    159 (86)    149 (81)    128         144 (78)
Older        31 (36)     35 (40)     45 (52)     66 (76)     50 (57)
Low SES      41 (48)     41 (48)     48 (56)     59 (69)     52 (60)
High SES    153 (82)    153 (82)    146 (78)    135 (73)    142 (76)
Public      144 (66)    143 (66)    143 (66)    155 (71)    144 (66)
Private      50 (93)     51 (94)     51 (94)     39 (72)     50 (93)

Note: (1) The numbers in parentheses are percentages.

(2) RM = regression model, ERM = equal risk model, CRM = constant
ratio model, CPM = conditional probability model, PLAM =
probability level assignment model.
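
The percentages in Table 4 appear to be the number selected divided by the group size reported in Table 3. A short sketch for the age groups, under that assumption (the names `group_n`, `selected`, and `percent_selected` are introduced only for this example), also shows how far apart the two groups' selection rates are under each model:

```python
# Group sizes from Table 3 and numbers selected from Table 4 (age groups).
group_n = {"Younger": 185, "Older": 87}
selected = {
    "RM":   {"Younger": 163, "Older": 31},
    "ERM":  {"Younger": 159, "Older": 35},
    "CRM":  {"Younger": 149, "Older": 45},
    "CPM":  {"Younger": 128, "Older": 66},
    "PLAM": {"Younger": 144, "Older": 50},
}


def percent_selected(count, n):
    """Selection rate as a whole-number percentage."""
    return round(100 * count / n)


gaps = {}
for model, counts in selected.items():
    pct = {g: percent_selected(c, group_n[g]) for g, c in counts.items()}
    # Positive gap: the younger group is selected at a higher rate.
    gaps[model] = pct["Younger"] - pct["Older"]
    print(f"{model:5s} younger {pct['Younger']}%  older {pct['Older']}%  "
          f"gap {gaps[model]}")
```

Every model selects 194 applicants in total; only the split between the groups changes.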






Table 5. Chances of Selecting a Potentially Successful Applicant at Each
Test Score Level Under the RM, ERM, CRM, CPM, and PLAM, According
to Gender, Age, Socioeconomic Status, and High School Type

                                     MODEL
                 RM          ERM         CRM         CPM        PLAM

GENDER
IAA LEVEL       M     F     M     F     M     F     M     F     M     F
65.5-72.5    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
58.5-65.5    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
51.5-58.5    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00   .74   .54
44.5-51.5     1/0   1/0   1/0   1/0   1/0   1/0   1/0   1/0   .40   .31
37.5-44.5     .00   .00   .00   .00   .00   .00   .00   .00   .13   .14
30.5-37.5     .00   .00   .00   .00   .00   .00   .00   .00   .03   .05
23.5-30.5     .00   .00   .00   .00   .00   .00   .00   .00   .00   .00

AGE
IAA LEVEL       Y     O     Y     O     Y     O     Y     O     Y     O
65.5-72.5    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
58.5-65.5    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
51.5-58.5    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00   .68   .42
44.5-51.5    1.00   1/0   1/0   1/0   1/0   1/0   1/0  1.00   .37   .29
37.5-44.5     .00   .00   .00   .00   .00   .00   .00   1/0   .13   .17
30.5-37.5     .00   .00   .00   .00   .00   .00   .00   .00   .00   .07
23.5-30.5     .00   .00   .00   .00   .00   .00   .00   .00   .00   .00

SOCIOECONOMIC STATUS
IAA LEVEL       L     H     L     H     L     H     L     H     L     H
65.5-72.5    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
58.5-65.5    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
51.5-58.5    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00   .63   .65
44.5-51.5     1/0  1.00   1/0  1.00   1/0   1/0   1/0   1/0   .28   .39
37.5-44.5     .00   1/0   .00   1/0   .00   .00   .00   .00   .09   .19
30.5-37.5     .00   .00   .00   .00   .00   .00   .00   .00   .02   .08
23.5-30.5     .00   .00   .00   .00   .00   .00   .00   .00   .00   .00

HIGH SCHOOL TYPE
IAA LEVEL      Pu    Pr    Pu    Pr    Pu    Pr    Pu    Pr    Pu    Pr
65.5-72.5    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
58.5-65.5    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
51.5-58.5    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00   .62  1.00
44.5-51.5     1/0  1.00   1/0  1.00   1/0  1.00   1/0   1/0   .29   .57
37.5-44.5     .00   1/0   .00   1/0   .00   1/0   .00   .00   .13   .29
30.5-37.5     .00   .00   .00   .00   .00   .00   .00   .00   .04   .00
23.5-30.5     .00   .00   .00   .00   .00   .00   .00   .00   .00   .00

Note: (1) The entry 1/0 indicates probabilities of both 1 and 0 within
an interval. This happens when the cutting point falls within the interval.

(2) M, F = male, female; Y, O = younger, older; L, H = lower, higher;
Pu, Pr = public, private.
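
For the four cutoff models, the entries in Table 5 depend only on where a group's cutting point lies relative to each score interval. A minimal sketch of that rule follows; the function name `chance_entry` and the cutting point of 46.0 are hypothetical and chosen only to illustrate the three cases:

```python
# Score intervals used in Table 5, from highest to lowest.
INTERVALS = [
    (65.5, 72.5), (58.5, 65.5), (51.5, 58.5), (44.5, 51.5),
    (37.5, 44.5), (30.5, 37.5), (23.5, 30.5),
]


def chance_entry(low, high, cutoff):
    """Table entry for one interval under a single-cutoff model."""
    if cutoff <= low:
        return "1.00"  # whole interval lies at or above the cutting point
    if cutoff >= high:
        return ".00"   # whole interval lies below the cutting point
    return "1/0"       # cutting point falls inside the interval


# Hypothetical cutting point, not a value reported in the paper.
column = [chance_entry(low, high, 46.0) for low, high in INTERVALS]
print(column)
```

With a cutting point of 46.0 the column reads 1.00, 1.00, 1.00, 1/0, .00, .00, .00 from the top interval down. The PLAM columns differ because selection between its two cutoffs is by lottery rather than by a single cutting point.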




Table 6. Success Ratio for Each Group and Overall According to the RM,
ERM, CRM, CPM, and PLAM

Group                  RM      ERM      CRM      CPM     PLAM

Male                .6881    .6757    .6667    .6952    .6250
Female              .6000    .6024    .6000    .5730    .6098
Difference          .0881    .0733    .0667    .1222    .0152
Gender Overall      .6495    .6443    .6392    .6340    .6186

Younger             .6564    .6541    .6846    .6953    .6458
Older               .5806    .5429    .5333    .4697    .6098
Difference          .0758    .1112    .1513    .2256    .0360
Age Overall         .6443    .6340    .6495    .6186    .5876

Low SES             .6341    .6341    .5833    .5085    .5962
High SES            .6601    .6601    .6712    .6667    .6338
Difference         -.0260   -.0260   -.0879   -.1582   -.0376
SES Overall         .6546    .6546    .6495    .6186    .6237

Public H.S.         .6042    .6014    .6014    .5806    .5764
Private H.S.        .7600    .7451    .7451    .8205    .7600
Difference         -.1558   -.1437   -.1437   -.2399   -.1836
School Overall      .6432    .6392    .6392    .6289    .6237

Note: The overall success ratios are not equal for the same model because the
applicants selected (n = 194) are not the same for different groups (classification
variables). The overall ratios for SES and high school type are equal for the
PLAM only by chance.
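
The overall ratios in Table 6 are consistent with a weighted average of the two group ratios, with the numbers selected from Table 4 as weights. A brief sketch for gender under the RM, assuming the success ratio is the number of successful selected applicants divided by the number selected (the function name `overall_success_ratio` is introduced for illustration):

```python
# RM column: numbers selected (Table 4) and group success ratios (Table 6).
selected = {"Male": 109, "Female": 85}
ratio = {"Male": 0.6881, "Female": 0.6000}


def overall_success_ratio(selected, ratio):
    """Weighted average of group success ratios, weighted by number selected."""
    # Implied number of successful applicants across groups.
    successes = sum(selected[g] * ratio[g] for g in selected)
    return successes / sum(selected.values())


overall = overall_success_ratio(selected, ratio)
print(round(overall, 4))
```

The result agrees with the tabled Gender Overall value of .6495 for the RM, which implies about 126 successful applicants among the 194 selected.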






Table 7. College GPA Mean and Standard Deviation for Each Group for
Applicants Selected Under the RM, ERM, CRM, CPM, and PLAM

                 RM            ERM           CRM           CPM           PLAM
Group         Mean    SD    Mean    SD    Mean    SD    Mean    SD    Mean    SD

Male          5.21  1.29    5.16  1.33    5.14  1.32    5.22  1.30    5.06  1.39
Female        4.93  1.43    4.93  1.44    4.96  1.45    4.84  1.46    4.92  1.51
Diff.          .28  -.14     .23  -.11     .18  -.13     .38  -.16     .14  -.12
Overall       5.09  1.36    5.06  1.38    5.07  1.37    5.05  1.38    5.00  1.44

Younger       5.15  1.37    5.16  1.38    5.24  1.37    5.32  1.39    5.15  1.41
Older         4.67  1.29    4.62  1.24    4.54  1.35    4.32  1.40    4.26  1.22
Diff.          .48   .08     .54   .14     .70   .02    1.00  -.01     .89   .19
Overall       5.08  1.37    5.06  1.37    5.08  1.39    4.98  1.47    4.91  1.43

Low SES       5.07  1.40    5.07  1.40    4.95  1.39    4.71  1.49    4.89  1.53
High SES      5.09  1.33    5.09  1.33    5.14  1.34    5.15  1.37    5.01  1.39
Diff.         -.02   .07    -.02   .07    -.19   .05    -.44   .12    -.12   .14
Overall       5.09  1.35    5.09  1.35    5.09  1.35    5.01  1.42    4.98  1.43

Public        4.92  1.35    4.92  1.36    4.92  1.36    4.84  1.38    4.85  1.40
Private       5.44  1.39    5.41  1.39    5.41  1.39    5.77  1.25    5.45  1.37
Diff.         -.52  -.04    -.49  -.03    -.49  -.03    -.93   .13    -.60   .03
Overall       5.05  1.38    5.05  1.38    5.05  1.38    5.02  1.40    5.01  1.41

Note: The overall means and standard deviations are not equal for the same
model because the applicants selected (n = 194) are not the same for different
groups (classification variables).
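
The Overall rows of Table 7 can be approximated from the group rows by pooling. The sketch below combines group means and standard deviations into an overall mean and standard deviation, assuming the tabled SDs use the n - 1 denominator (an assumption, since the paper does not say) and taking the group sizes from Table 4; the function name `combine` is hypothetical:

```python
import math


def combine(groups):
    """Pool (n, mean, sd) triples into an overall mean and sample SD."""
    n_total = sum(n for n, _, _ in groups)
    mean = sum(n * m for n, m, _ in groups) / n_total
    # Total sum of squares = within-group part + between-group part.
    ss = sum((n - 1) * sd ** 2 + n * (m - mean) ** 2 for n, m, sd in groups)
    return mean, math.sqrt(ss / (n_total - 1))


# Gender under the RM: 109 males and 85 females selected (Table 4).
mean, sd = combine([(109, 5.21, 1.29), (85, 4.93, 1.43)])
print(round(mean, 2), round(sd, 2))  # 5.09 1.36
```

The pooled values match the tabled overall mean of 5.09 and standard deviation of 1.36 for gender under the RM.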






